Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 684 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 111.7 KiB |
| Average record size in memory | 167.3 B |
Variable types
| NUM | 6 |
|---|---|
| UNSUPPORTED | 4 |
| CAT | 2 |
Reproduction
| Analysis started | 2020-07-26 14:52:10.980000 |
|---|---|
| Analysis finished | 2020-07-26 14:52:29.564000 |
| Duration | 18.58 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Id has unique values | Unique |
|---|---|
| TotalBilirubin is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| DirectBilirubin is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| AlkphosAlkalinePhosphotase is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| AlbuminGlobulinRatio is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Id
Real number (ℝ≥0)
| Distinct count | 684 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 342.5 |
|---|---|
| Minimum | 1 |
| Maximum | 684 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 35.15 |
| Q1 | 171.75 |
| median | 342.5 |
| Q3 | 513.25 |
| 95-th percentile | 649.85 |
| Maximum | 684 |
| Range | 683 |
| Interquartile range (IQR) | 341.5 |
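The quantile values above are consistent with linear interpolation between order statistics. The following is a minimal sketch of that calculation, assuming the Id column holds the consecutive integers 1 to 684 (an assumption based on the distinct count, minimum and maximum reported above). The helper name `quantile` is illustrative and is not part of pandas-profiling.

```python
def quantile(sorted_values, q):
    """Quantile of an already sorted sequence, using linear interpolation."""
    position = q * (len(sorted_values) - 1)       # fractional index into the data
    lower = int(position)                         # index just below the position
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: Id values are the consecutive integers 1..684.
ids = list(range(1, 685))

q05 = quantile(ids, 0.05)
q1 = quantile(ids, 0.25)
median = quantile(ids, 0.50)
q3 = quantile(ids, 0.75)
q95 = quantile(ids, 0.95)
iqr = q3 - q1            # interquartile range
value_range = ids[-1] - ids[0]

print(round(q05, 2), q1, median, q3, round(q95, 2), value_range, iqr)
```

Under that assumption the results agree with the table: Q1 is 171.75, the median is 342.5, Q3 is 513.25 and the IQR is 341.5.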
Descriptive statistics
| Standard deviation | 197.5980769 |
|---|---|
| Coefficient of variation (CV) | 0.5769286917 |
| Kurtosis | -1.2 |
| Mean | 342.5 |
| Median Absolute Deviation (MAD) | 171 |
| Skewness | 0 |
| Sum | 234270 |
| Variance | 39045 |
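The descriptive statistics above can be reproduced with the Python standard library. This is a sketch only, again assuming the Id column is the sequence 1 to 684. It suggests that the reported variance is the sample variance (divisor n - 1) and that the coefficient of variation is the standard deviation divided by the mean.

```python
import statistics

# Assumption: Id values are the consecutive integers 1..684.
ids = list(range(1, 685))

total = sum(ids)                        # 234270
mean = statistics.mean(ids)             # 342.5
variance = statistics.variance(ids)     # sample variance, divisor n - 1
std = statistics.stdev(ids)             # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of the absolute distances to the median.
center = statistics.median(ids)
mad = statistics.median(abs(x - center) for x in ids)

print(total, mean, variance, round(std, 7), round(cv, 10), mad)
```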
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 684 | 1 | 0.1% | |
| 225 | 1 | 0.1% | |
| 233 | 1 | 0.1% | |
| 232 | 1 | 0.1% | |
| 231 | 1 | 0.1% | |
| 230 | 1 | 0.1% | |
| 229 | 1 | 0.1% | |
| 228 | 1 | 0.1% | |
| 227 | 1 | 0.1% | |
| 226 | 1 | 0.1% | |
| 224 | 1 | 0.1% | |
| 214 | 1 | 0.1% | |
| 223 | 1 | 0.1% | |
| 222 | 1 | 0.1% | |
| 221 | 1 | 0.1% | |
| 220 | 1 | 0.1% | |
| 219 | 1 | 0.1% | |
| 218 | 1 | 0.1% | |
| 217 | 1 | 0.1% | |
| 216 | 1 | 0.1% | |
| 234 | 1 | 0.1% | |
| 235 | 1 | 0.1% | |
| 236 | 1 | 0.1% | |
| 237 | 1 | 0.1% | |
| 254 | 1 | 0.1% | |
| Other values (659) | 659 | 96.3% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1 | 0.1% | |
| 2 | 1 | 0.1% | |
| 3 | 1 | 0.1% | |
| 4 | 1 | 0.1% | |
| 5 | 1 | 0.1% | |
| 6 | 1 | 0.1% | |
| 7 | 1 | 0.1% | |
| 8 | 1 | 0.1% | |
| 9 | 1 | 0.1% | |
| 10 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 684 | 1 | 0.1% | |
| 683 | 1 | 0.1% | |
| 682 | 1 | 0.1% | |
| 681 | 1 | 0.1% | |
| 680 | 1 | 0.1% | |
| 679 | 1 | 0.1% | |
| 678 | 1 | 0.1% | |
| 677 | 1 | 0.1% | |
| 676 | 1 | 0.1% | |
| 675 | 1 | 0.1% |
Age
Real number (ℝ≥0)
| Distinct count | 72 |
|---|---|
| Unique (%) | 10.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45.099415204678365 |
|---|---|
| Minimum | 4 |
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.4 KiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 18 |
| Q1 | 33 |
| median | 45 |
| Q3 | 58 |
| 95-th percentile | 72 |
| Maximum | 90 |
| Range | 86 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 16.30758699 |
|---|---|
| Coefficient of variation (CV) | 0.3615919834 |
| Kurtosis | -0.5891489782 |
| Mean | 45.0994152 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -0.03254672964 |
| Sum | 30848 |
| Variance | 265.9373935 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 60 | 41 | 6.0% | |
| 45 | 27 | 3.9% | |
| 38 | 26 | 3.8% | |
| 42 | 24 | 3.5% | |
| 50 | 24 | 3.5% | |
| 48 | 22 | 3.2% | |
| 32 | 22 | 3.2% | |
| 40 | 21 | 3.1% | |
| 58 | 20 | 2.9% | |
| 55 | 20 | 2.9% | |
| 33 | 18 | 2.6% | |
| 65 | 18 | 2.6% | |
| 46 | 18 | 2.6% | |
| 75 | 17 | 2.5% | |
| 26 | 15 | 2.2% | |
| 18 | 14 | 2.0% | |
| 51 | 13 | 1.9% | |
| 35 | 13 | 1.9% | |
| 57 | 13 | 1.9% | |
| 62 | 13 | 1.9% | |
| 66 | 13 | 1.9% | |
| 34 | 12 | 1.8% | |
| 49 | 12 | 1.8% | |
| 30 | 11 | 1.6% | |
| 36 | 11 | 1.6% | |
| Other values (47) | 226 | 33.0% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 2 | 0.3% | |
| 6 | 1 | 0.1% | |
| 7 | 2 | 0.3% | |
| 8 | 1 | 0.1% | |
| 10 | 1 | 0.1% | |
| 11 | 1 | 0.1% | |
| 12 | 2 | 0.3% | |
| 13 | 5 | 0.7% | |
| 14 | 3 | 0.4% | |
| 15 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 90 | 1 | 0.1% | |
| 85 | 2 | 0.3% | |
| 84 | 2 | 0.3% | |
| 78 | 1 | 0.1% | |
| 75 | 17 | 2.5% | |
| 74 | 5 | 0.7% | |
| 73 | 2 | 0.3% | |
| 72 | 10 | 1.5% | |
| 70 | 10 | 1.5% | |
| 69 | 2 | 0.3% |
Gender
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Male | 518 | 75.7% | |
| Female | 166 | 24.3% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.485380117 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| e | 850 | 27.7% | |
| a | 684 | 22.3% | |
| l | 684 | 22.3% | |
| M | 518 | 16.9% | |
| F | 166 | 5.4% | |
| m | 166 | 5.4% |
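The length and character tables follow directly from the two category counts. The sketch below derives them from the counts reported for Gender (518 "Male", 166 "Female"). The variable names are illustrative only.

```python
from collections import Counter

# Category counts as reported in the Gender frequency table.
category_counts = {"Male": 518, "Female": 166}

# Each character of a label occurs once per observation carrying that label.
char_counts = Counter()
for label, count in category_counts.items():
    for ch in label:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(category_counts.values())

for ch, count in char_counts.most_common():
    print(ch, count, f"{100 * count / total_chars:.1f}%")
print("mean length:", round(mean_length, 9))
```

The letter "e" appears once in "Male" and twice in "Female", which gives 518 + 2 × 166 = 850 occurrences out of 3068 characters.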
Most occurring categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Lowercase Letter | 2384 | 77.7% | |
| Uppercase Letter | 684 | 22.3% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| M | 518 | 75.7% | |
| F | 166 | 24.3% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| e | 850 | 35.7% | |
| a | 684 | 28.7% | |
| l | 684 | 28.7% | |
| m | 166 | 7.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Latin | 3068 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| e | 850 | 27.7% | |
| a | 684 | 22.3% | |
| l | 684 | 22.3% | |
| M | 518 | 16.9% | |
| F | 166 | 5.4% | |
| m | 166 | 5.4% |
Most occurring blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 3068 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| e | 850 | 27.7% | |
| a | 684 | 22.3% | |
| l | 684 | 22.3% | |
| M | 518 | 16.9% | |
| F | 166 | 5.4% | |
| m | 166 | 5.4% |
SgptAlamineAminotransferase
Real number (ℝ≥0)
| Distinct count | 152 |
|---|---|
| Unique (%) | 22.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 86.59356725146199 |
|---|---|
| Minimum | 10 |
| Maximum | 2000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.4 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 23 |
| median | 36 |
| Q3 | 62 |
| 95-th percentile | 308 |
| Maximum | 2000 |
| Range | 1990 |
| Interquartile range (IQR) | 39 |
Descriptive statistics
| Standard deviation | 196.577834 |
|---|---|
| Coefficient of variation (CV) | 2.270120521 |
| Kurtosis | 43.21768505 |
| Mean | 86.59356725 |
| Median Absolute Deviation (MAD) | 16 |
| Skewness | 6.115990795 |
| Sum | 59230 |
| Variance | 38642.84482 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 25 | 28 | 4.1% | |
| 20 | 27 | 3.9% | |
| 22 | 22 | 3.2% | |
| 21 | 19 | 2.8% | |
| 28 | 18 | 2.6% | |
| 18 | 18 | 2.6% | |
| 48 | 16 | 2.3% | |
| 15 | 16 | 2.3% | |
| 31 | 15 | 2.2% | |
| 30 | 15 | 2.2% | |
| 24 | 15 | 2.2% | |
| 27 | 13 | 1.9% | |
| 33 | 13 | 1.9% | |
| 32 | 12 | 1.8% | |
| 29 | 12 | 1.8% | |
| 36 | 12 | 1.8% | |
| 26 | 11 | 1.6% | |
| 14 | 11 | 1.6% | |
| 42 | 11 | 1.6% | |
| 12 | 11 | 1.6% | |
| 35 | 11 | 1.6% | |
| 37 | 11 | 1.6% | |
| 16 | 11 | 1.6% | |
| 38 | 10 | 1.5% | |
| 40 | 10 | 1.5% | |
| Other values (127) | 316 | 46.2% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 4 | 0.6% | |
| 11 | 2 | 0.3% | |
| 12 | 11 | 1.6% | |
| 13 | 5 | 0.7% | |
| 14 | 11 | 1.6% | |
| 15 | 16 | 2.3% | |
| 16 | 11 | 1.6% | |
| 17 | 10 | 1.5% | |
| 18 | 18 | 2.6% | |
| 19 | 8 | 1.2% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2000 | 1 | 0.1% | |
| 1680 | 2 | 0.3% | |
| 1630 | 2 | 0.3% | |
| 1350 | 1 | 0.1% | |
| 1250 | 2 | 0.3% | |
| 950 | 1 | 0.1% | |
| 875 | 4 | 0.6% | |
| 790 | 1 | 0.1% | |
| 779 | 1 | 0.1% | |
| 622 | 1 | 0.1% |
SgotAspartateAminotransferase
Real number (ℝ≥0)
| Distinct count | 177 |
|---|---|
| Unique (%) | 25.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 116.1798245614035 |
|---|---|
| Minimum | 10 |
| Maximum | 4929 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.4 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 25 |
| median | 43 |
| Q3 | 88.25 |
| 95-th percentile | 499.55 |
| Maximum | 4929 |
| Range | 4919 |
| Interquartile range (IQR) | 63.25 |
Descriptive statistics
| Standard deviation | 281.9611483 |
|---|---|
| Coefficient of variation (CV) | 2.426937287 |
| Kurtosis | 141.0984284 |
| Mean | 116.1798246 |
| Median Absolute Deviation (MAD) | 22 |
| Skewness | 9.831824802 |
| Sum | 79467 |
| Variance | 79502.08914 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 23 | 20 | 2.9% | |
| 30 | 16 | 2.3% | |
| 28 | 16 | 2.3% | |
| 24 | 15 | 2.2% | |
| 21 | 15 | 2.2% | |
| 20 | 15 | 2.2% | |
| 22 | 14 | 2.0% | |
| 34 | 14 | 2.0% | |
| 25 | 13 | 1.9% | |
| 19 | 13 | 1.9% | |
| 32 | 12 | 1.8% | |
| 29 | 12 | 1.8% | |
| 58 | 12 | 1.8% | |
| 15 | 12 | 1.8% | |
| 18 | 11 | 1.6% | |
| 40 | 11 | 1.6% | |
| 16 | 11 | 1.6% | |
| 26 | 11 | 1.6% | |
| 17 | 10 | 1.5% | |
| 14 | 10 | 1.5% | |
| 43 | 9 | 1.3% | |
| 27 | 9 | 1.3% | |
| 42 | 9 | 1.3% | |
| 41 | 9 | 1.3% | |
| 31 | 9 | 1.3% | |
| Other values (152) | 376 | 55.0% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 1 | 0.1% | |
| 11 | 3 | 0.4% | |
| 12 | 7 | 1.0% | |
| 13 | 3 | 0.4% | |
| 14 | 10 | 1.5% | |
| 15 | 12 | 1.8% | |
| 16 | 11 | 1.6% | |
| 17 | 10 | 1.5% | |
| 18 | 11 | 1.6% | |
| 19 | 13 | 1.9% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4929 | 1 | 0.1% | |
| 2946 | 1 | 0.1% | |
| 1600 | 1 | 0.1% | |
| 1500 | 1 | 0.1% | |
| 1050 | 2 | 0.3% | |
| 960 | 2 | 0.3% | |
| 950 | 2 | 0.3% | |
| 850 | 7 | 1.0% | |
| 844 | 1 | 0.1% | |
| 794 | 2 | 0.3% |
TotalProtiens
Real number (ℝ≥0)
| Distinct count | 58 |
|---|---|
| Unique (%) | 8.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.433187134502925 |
|---|---|
| Minimum | 2.7 |
| Maximum | 9.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.4 KiB |
Quantile statistics
| Minimum | 2.7 |
|---|---|
| 5-th percentile | 4.6 |
| Q1 | 5.7 |
| median | 6.5 |
| Q3 | 7.2 |
| 95-th percentile | 8.085 |
| Maximum | 9.6 |
| Range | 6.9 |
| Interquartile range (IQR) | 1.5 |
Descriptive statistics
| Standard deviation | 1.081344976 |
|---|---|
| Coefficient of variation (CV) | 0.1680885312 |
| Kurtosis | 0.1222329834 |
| Mean | 6.433187135 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | -0.2768852305 |
| Sum | 4400.3 |
| Variance | 1.169306958 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 38 | 5.6% | |
| 6.8 | 37 | 5.4% | |
| 6 | 35 | 5.1% | |
| 6.2 | 28 | 4.1% | |
| 6.9 | 27 | 3.9% | |
| 7.2 | 26 | 3.8% | |
| 7.1 | 24 | 3.5% | |
| 7.3 | 22 | 3.2% | |
| 6.4 | 21 | 3.1% | |
| 5.5 | 21 | 3.1% | |
| 8 | 21 | 3.1% | |
| 5.6 | 21 | 3.1% | |
| 5.8 | 19 | 2.8% | |
| 6.1 | 19 | 2.8% | |
| 6.6 | 18 | 2.6% | |
| 7.5 | 18 | 2.6% | |
| 6.7 | 17 | 2.5% | |
| 6.5 | 17 | 2.5% | |
| 6.3 | 17 | 2.5% | |
| 5.9 | 16 | 2.3% | |
| 5.2 | 16 | 2.3% | |
| 5.4 | 15 | 2.2% | |
| 7.4 | 15 | 2.2% | |
| 7.9 | 14 | 2.0% | |
| 5 | 13 | 1.9% | |
| Other values (33) | 149 | 21.8% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2.7 | 1 | 0.1% | |
| 2.8 | 1 | 0.1% | |
| 3 | 1 | 0.1% | |
| 3.6 | 3 | 0.4% | |
| 3.7 | 2 | 0.3% | |
| 3.8 | 2 | 0.3% | |
| 3.9 | 3 | 0.4% | |
| 4 | 3 | 0.4% | |
| 4.1 | 2 | 0.3% | |
| 4.3 | 5 | 0.7% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9.6 | 1 | 0.1% | |
| 9.5 | 1 | 0.1% | |
| 9.2 | 2 | 0.3% | |
| 8.9 | 1 | 0.1% | |
| 8.7 | 1 | 0.1% | |
| 8.6 | 3 | 0.4% | |
| 8.5 | 5 | 0.7% | |
| 8.4 | 3 | 0.4% | |
| 8.3 | 3 | 0.4% | |
| 8.2 | 8 | 1.2% |
Albumin
Real number (ℝ≥0)
| Distinct count | 40 |
|---|---|
| Unique (%) | 5.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.1178362573099414 |
|---|---|
| Minimum | 0.9 |
| Maximum | 5.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.4 KiB |
Quantile statistics
| Minimum | 0.9 |
|---|---|
| 5-th percentile | 1.8 |
| Q1 | 2.6 |
| median | 3.1 |
| Q3 | 3.7 |
| 95-th percentile | 4.3 |
| Maximum | 5.5 |
| Range | 4.6 |
| Interquartile range (IQR) | 1.1 |
Descriptive statistics
| Standard deviation | 0.7835258352 |
|---|---|
| Coefficient of variation (CV) | 0.2513043568 |
| Kurtosis | -0.3794806307 |
| Mean | 3.117836257 |
| Median Absolute Deviation (MAD) | 0.6 |
| Skewness | -0.04155474107 |
| Sum | 2132.6 |
| Variance | 0.6139127345 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3 | 56 | 8.2% | |
| 4 | 39 | 5.7% | |
| 2.9 | 34 | 5.0% | |
| 3.2 | 33 | 4.8% | |
| 3.1 | 33 | 4.8% | |
| 3.9 | 30 | 4.4% | |
| 2.7 | 29 | 4.2% | |
| 2.5 | 28 | 4.1% | |
| 3.3 | 27 | 3.9% | |
| 2 | 26 | 3.8% | |
| 3.5 | 26 | 3.8% | |
| 3.4 | 25 | 3.7% | |
| 2.6 | 24 | 3.5% | |
| 3.6 | 24 | 3.5% | |
| 3.7 | 24 | 3.5% | |
| 2.8 | 20 | 2.9% | |
| 2.4 | 19 | 2.8% | |
| 4.1 | 18 | 2.6% | |
| 3.8 | 17 | 2.5% | |
| 2.1 | 16 | 2.3% | |
| 2.3 | 15 | 2.2% | |
| 4.3 | 15 | 2.2% | |
| 1.8 | 15 | 2.2% | |
| 2.2 | 14 | 2.0% | |
| 4.2 | 12 | 1.8% | |
| Other values (15) | 65 | 9.5% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.9 | 2 | 0.3% | |
| 1 | 1 | 0.1% | |
| 1.4 | 3 | 0.4% | |
| 1.5 | 4 | 0.6% | |
| 1.6 | 11 | 1.6% | |
| 1.7 | 4 | 0.6% | |
| 1.8 | 15 | 2.2% | |
| 1.9 | 8 | 1.2% | |
| 2 | 26 | 3.8% | |
| 2.1 | 16 | 2.3% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5.5 | 2 | 0.3% | |
| 5 | 1 | 0.1% | |
| 4.9 | 4 | 0.6% | |
| 4.8 | 2 | 0.3% | |
| 4.7 | 3 | 0.4% | |
| 4.6 | 4 | 0.6% | |
| 4.5 | 6 | 0.9% | |
| 4.4 | 10 | 1.5% | |
| 4.3 | 15 | 2.2% | |
| 4.2 | 12 | 1.8% |
Selector
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.4 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 493 | 72.1% | |
| 2 | 191 | 27.9% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 493 | 72.1% | |
| 2 | 191 | 27.9% |
Most occurring categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Decimal Number | 684 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 493 | 72.1% | |
| 2 | 191 | 27.9% |
Most occurring scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Common | 684 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 493 | 72.1% | |
| 2 | 191 | 27.9% |
Most occurring blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 684 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 493 | 72.1% | |
| 2 | 191 | 27.9% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
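The definition above can be sketched in a few lines of plain Python. The function name `pearson_r` is illustrative, and the sample data are the SgptAlamineAminotransferase and SgotAspartateAminotransferase values of the first ten rows shown later in this report. The rescaled call illustrates the invariance under a change of location and a positive change of scale.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Values taken from the first ten rows of the dataset.
sgpt = [16, 64, 60, 14, 27, 19, 16, 14, 22, 53]
sgot = [18, 100, 68, 20, 59, 14, 12, 11, 19, 58]

r = pearson_r(sgpt, sgot)
# Shifting and positively rescaling one variable should leave r unchanged.
r_rescaled = pearson_r([2 * x + 3 for x in sgpt], sgot)

print(r, r_rescaled)
```

Ten rows are far too few to say anything about the full dataset. The sample only serves to exercise the formula.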
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
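As a sketch of this calculation, the code below ranks each variable and then applies the Pearson formula to the ranks. Tied values receive the average of the ranks they span, which is a common convention but an assumption here, since the report does not state how ties are handled. The function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks, where tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: rho is 1 although r is below 1.
xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]
print(spearman_rho(xs, cubes), pearson_r(xs, cubes))
```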
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
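The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, with the illustrative function name `kendall_tau_a`. Statistical libraries often default to a tie-adjusted variant (τ-b), so results on data with ties may differ from this sketch.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A pair is concordant when both variables order the two observations
        # the same way, and discordant when they order them oppositely.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Three observations give three pairs: two concordant, one discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```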
Phik (φk)
Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Id | Age | Gender | TotalBilirubin | DirectBilirubin | AlkphosAlkalinePhosphotase | SgptAlamineAminotransferase | SgotAspartateAminotransferase | TotalProtiens | Albumin | AlbuminGlobulinRatio | Selector |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 65 | Female | 0.7 | 0.1 | 187 | 16 | 18 | 6.8 | 3.3 | 0.9 | 1 |
| 1 | 2 | 62 | Male | 10.9 | 5.5 | 699 | 64 | 100 | 7.5 | 3.2 | 0.74 | 1 |
| 2 | 3 | 62 | Male | 7.3 | 4.1 | 490 | 60 | 68 | 7.0 | 3.3 | 0.89 | 1 |
| 3 | 4 | 58 | Male | 1 | 0.4 | 182 | 14 | 20 | 6.8 | 3.4 | 1 | 1 |
| 4 | 5 | 72 | Male | 3.9 | 2 | 195 | 27 | 59 | 7.3 | 2.4 | 0.4 | 1 |
| 5 | 6 | 46 | Male | 1.8 | 0.7 | 208 | 19 | 14 | 7.6 | 4.4 | 1.3 | 1 |
| 6 | 7 | 26 | Female | 0.9 | 0.2 | 154 | 16 | 12 | 7.0 | 3.5 | 1 | 1 |
| 7 | 8 | 29 | Female | 0.9 | 0.3 | 202 | 14 | 11 | 6.7 | 3.6 | 1.1 | 1 |
| 8 | 9 | 17 | Male | 0.9 | 0.3 | 202 | 22 | 19 | 7.4 | 4.1 | 1.2 | 2 |
| 9 | 10 | 55 | Male | 0.7 | 0.2 | 290 | 53 | 58 | 6.8 | 3.4 | 1 | 1 |
Last rows
| | Id | Age | Gender | TotalBilirubin | DirectBilirubin | AlkphosAlkalinePhosphotase | SgptAlamineAminotransferase | SgotAspartateAminotransferase | TotalProtiens | Albumin | AlbuminGlobulinRatio | Selector |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 674 | 675 | 32 | Male | 3.7 | 1.6 | 612 | 50 | 88 | 6.2 | 1.9 | 0.4 | 1 |
| 675 | 676 | 32 | Male | 12.1 | 6 | 515 | 48 | 92 | 6.6 | 2.4 | 0.5 | 1 |
| 676 | 677 | 32 | Male | 25 | 13.7 | 560 | 41 | 88 | 7.9 | 2.5 | 2.5 | 1 |
| 677 | 678 | 32 | Male | 15 | 8.2 | 289 | 58 | 80 | 5.3 | 2.2 | 0.7 | 1 |
| 678 | 679 | 32 | Male | 12.7 | 8.4 | 190 | 28 | 47 | 5.4 | 2.6 | 0.9 | 1 |
| 679 | 680 | 60 | Male | 0.5 | 0.1 | 500 | 20 | 34 | 5.9 | 1.6 | 0.37 | 2 |
| 680 | 681 | 40 | Male | 0.6 | 0.1 | 98 | 35 | 31 | 6.0 | 3.2 | 1.1 | 1 |
| 681 | 682 | 52 | Male | 0.8 | 0.2 | 245 | 48 | 49 | 6.4 | 3.2 | 1 | 1 |
| 682 | 683 | 31 | Male | 1.3 | 0.5 | 184 | 29 | 32 | 6.8 | 3.4 | 1 | 1 |
| 683 | 684 | 38 | Male | 1 | 0.3 | 216 | 21 | 24 | 7.3 | 4.4 | 1.5 | 2 |